-
Predicting polycystic ovary syndrome with machine learning algorithms from electronic health records
Introduction: Predictive models have been used to aid early diagnosis of PCOS, though existing models are based on small sample sizes and limited to fertility clinic populations. We built a predictive model using machine learning algorithms based on an outpatient population at risk for PCOS to predict risk and facilitate earlier diagnosis, particularly among those who meet diagnostic criteria but have not received a diagnosis.
Methods: This is a retrospective cohort study from a safety-net hospital’s electronic health records (EHR) from 2003–2016. The study population included 30,601 women aged 18–45 years without concurrent endocrinopathy who had any visit to Boston Medical Center for primary care, obstetrics and gynecology, endocrinology, family medicine, or general internal medicine. Four prediction outcomes were assessed for PCOS. The first outcome was PCOS ICD-9 diagnosis, with additional model outcomes of algorithm-defined PCOS. The latter was based on Rotterdam criteria and merged laboratory values, radiographic imaging, and ICD data from the EHR to define irregular menstruation, hyperandrogenism, and polycystic ovarian morphology on ultrasound.
Results: We developed predictive models using four machine learning methods: logistic regression, support vector machine, gradient boosted trees, and random forests. Hormone values (follicle-stimulating hormone, luteinizing hormone, estradiol, and sex hormone binding globulin) were combined to create a multilayer perceptron score using a neural network classifier. Prediction of PCOS prior to clinical diagnosis in an out-of-sample test set of patients achieved an average AUC of 85%, 81%, 80%, and 82%, respectively, in Models I, II, III, and IV. Significant positive predictors of PCOS diagnosis across models included hormone levels and obesity; negative predictors included gravidity and positive bHCG.
Conclusion: Machine learning algorithms were used to predict PCOS based on a large at-risk population. This approach may guide early detection of PCOS within EHR-interfaced populations to facilitate counseling and interventions that may reduce long-term health consequences. Our model illustrates the potential benefits of an artificial intelligence-enabled provider assistance tool that can be integrated into the EHR to reduce delays in diagnosis. However, model validation in other hospital-based populations is necessary.
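As an illustration of the algorithm-defined outcome, the Rotterdam criteria are commonly stated as requiring at least two of the three features named in the abstract. The sketch below encodes only that two-of-three rule. The abstract does not say how its four outcomes combine the features, so the class name, field names, and function name are hypothetical and this is not the authors' algorithm.

```python
from dataclasses import dataclass


@dataclass
class RotterdamFlags:
    """Hypothetical per-patient flags derived from EHR data."""
    irregular_menstruation: bool
    hyperandrogenism: bool
    polycystic_morphology: bool


def meets_rotterdam(flags: RotterdamFlags) -> bool:
    """Return True when at least two of the three features are present.

    Assumption: this is the commonly cited two-of-three rule, not the
    study's exact outcome definition.
    """
    present = sum(
        [
            flags.irregular_menstruation,
            flags.hyperandrogenism,
            flags.polycystic_morphology,
        ]
    )
    return present >= 2
```

A patient flagged for hyperandrogenism and polycystic morphology but with regular cycles would satisfy this rule, while a patient with only one flag would not.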
-
Abstract
Background: Polycystic ovary morphology (PCOM) is an ultrasonographic finding that can be present in women with ovulatory disorder and oligomenorrhea due to hypothalamic, pituitary, and ovarian dysfunction. While air pollution has emerged as a possible disrupter of hormone homeostasis, limited research has been conducted on the association between air pollution and PCOM.
Methods: We conducted a longitudinal cohort study using electronic medical records data of 5,492 women with normal ovaries at the first ultrasound who underwent a repeated pelvic ultrasound examination during the study period (2004–2016) at Boston Medical Center. Machine learning text algorithms classified PCOM by ultrasound. We used geocoded home address to determine the ambient annual average PM2.5 exposures and categorized them into tertiles of exposure. We used Cox proportional hazards models on complete data (n = 3,994), adjusting for covariates, and additionally stratified by race/ethnicity and body mass index (BMI).
Results: Cumulative exposure to PM2.5 during the study ranged from 4.9 to 17.5 µg/m³ (mean = 10.0 µg/m³). On average, women were 31 years old and 58% were Black/African American. Hazard ratios and 95% confidence intervals (CI) comparing the second and third PM2.5 exposure tertiles vs. the reference tertile were 1.12 (0.88, 1.43) and 0.89 (0.62, 1.28), respectively. No appreciable differences were observed across race/ethnicity. Among women with BMI ≥ 30 kg/m², we observed weak inverse associations with PCOM for the second (HR: 0.93, 95% CI: 0.66, 1.33) and third tertiles (HR: 0.89, 95% CI: 0.50, 1.57).
Conclusions: In this study of reproductive-aged women, we observed little association between PM2.5 concentrations and PCOM incidence. No dose-response relationships were observed, nor were estimates appreciably different across race/ethnicity within this clinically sourced cohort.
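The tertile categorization of exposure described in the Methods can be sketched as follows. This is a minimal illustration, assuming cut points come from the sample's own distribution. The function name and the example values are hypothetical, and the abstract does not state which quantile method or boundary rule the authors used.

```python
import bisect
import statistics


def assign_tertiles(exposures):
    """Label each exposure value 1, 2, or 3 by sample tertile.

    Assumption: cut points come from statistics.quantiles with its
    default 'exclusive' method; values equal to a cut point fall in
    the lower tertile.
    """
    cuts = statistics.quantiles(exposures, n=3)  # two cut points
    return [bisect.bisect_left(cuts, x) + 1 for x in exposures]


# Hypothetical annual-average PM2.5 values in µg/m³, for illustration only.
example = [5.0, 7.0, 9.0, 10.0, 11.0, 13.0, 15.0, 16.0, 17.0]
labels = assign_tertiles(example)
```

With tertile 1 as the reference group, labels 2 and 3 would correspond to the second and third exposure tertiles compared in the hazard ratios.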
-
Abstract
The aim of this study is to determine the most informative pre- and in-cycle variables for predicting success for a first autologous oocyte in-vitro fertilization (IVF) cycle. This is a retrospective study using 22,413 first autologous oocyte IVF cycles from 2001 to 2018. Models were developed to predict pregnancy following an IVF cycle with a fresh embryo transfer. The importance of each variable was determined by its coefficient in a logistic regression model and the prediction accuracy based on different variable sets was reported. The area under the receiver operating characteristic curve (AUC) on a validation patient cohort was the metric for prediction accuracy. Three factors were found to be of importance when predicting IVF success: age in three groups (38–40, 41–42, and above 42 years old), number of transferred embryos, and number of cryopreserved embryos. For predicting first-cycle IVF pregnancy using all available variables, the predictive model achieved an AUC of 68% ± 0.01%. A parsimonious predictive model utilizing age (38–40, 41–42, and above 42 years old), number of transferred embryos, and number of cryopreserved embryos achieved an AUC of 65% ± 0.01%. The proposed models accurately predict a single IVF cycle pregnancy outcome and identify important predictive variables associated with the outcome. These models are limited to predicting pregnancy immediately after the IVF cycle and not live birth. These models do not include indicators of multiple gestation and are not intended for clinical application.
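Since AUC is the accuracy metric reported here, a small sketch of how it can be computed may help. AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name and the toy labels and scores below are hypothetical and are not data from the study.

```python
def roc_auc(labels, scores):
    """Compute AUC by comparing every positive/negative score pair.

    labels: iterable of 0/1 outcomes (1 = pregnancy, in this setting).
    scores: predicted probabilities or risk scores, same length.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ordered correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise loop is quadratic in cohort size, so for a cohort of tens of thousands of cycles a rank-based formulation or a library routine would be the practical choice.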
